Sentence Extraction System Assembling Multiple Evidence
Authors
Abstract
We have developed a sentence extraction system that estimates the significance of sentences by integrating four scoring functions, each drawing on a different kind of evidence: sentence location, sentence length, TF/IDF values of words, and similarity to the title. For the information retrieval summarization task, similarity to a given query is added as a further score. Parameters for the scoring functions were optimized experimentally using dry run data of the TSC. Results from the TSC formal run showed that our method was effective in the sentence extraction task.
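The integration described in the abstract can be illustrated with a minimal sketch. The abstract only names the four kinds of evidence; the concrete form of each scoring function, the linear weighted sum, the default weights, and all function names below are assumptions made for illustration, not the authors' published formulas or tuned parameters.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; a deliberately simple tokenizer (assumption).
    return re.findall(r"\w+", text.lower())


def location_score(index):
    # Assumed form: earlier sentences score higher.
    return 1.0 / (index + 1)


def length_score(tokens, target_len=10):
    # Assumed form: grows with length, capped at target_len tokens.
    return min(len(tokens), target_len) / target_len


def tfidf_score(tokens, term_freq, doc_freq, n_docs):
    # Average TF * IDF over the sentence's tokens.
    # Words missing from doc_freq are treated as appearing in one document.
    if not tokens:
        return 0.0
    total = sum(
        term_freq[w] * math.log(n_docs / doc_freq.get(w, 1)) for w in tokens
    )
    return total / len(tokens)


def title_score(tokens, title_tokens):
    # Fraction of distinct title words that also occur in the sentence.
    title_set = set(title_tokens)
    if not title_set:
        return 0.0
    return len(set(tokens) & title_set) / len(title_set)


def score_sentences(sentences, title, doc_freq, n_docs,
                    weights=(1.0, 1.0, 1.0, 1.0)):
    # Combine the four scores as a weighted sum (hypothetical weights).
    tokenized = [tokenize(s) for s in sentences]
    term_freq = Counter(w for toks in tokenized for w in toks)
    title_tokens = tokenize(title)
    scores = []
    for i, toks in enumerate(tokenized):
        parts = (
            location_score(i),
            length_score(toks),
            tfidf_score(toks, term_freq, doc_freq, n_docs),
            title_score(toks, title_tokens),
        )
        scores.append(sum(w * p for w, p in zip(weights, parts)))
    return scores


def extract(sentences, title, doc_freq, n_docs,
            weights=(1.0, 1.0, 1.0, 1.0), k=1):
    # Pick the k highest-scoring sentences, returned in document order.
    scores = score_sentences(sentences, title, doc_freq, n_docs, weights)
    top = sorted(range(len(sentences)), key=lambda i: scores[i],
                 reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In this sketch the weights play the role of the parameters that the abstract says were tuned on dry run data. A query-similarity score for the information retrieval task could be added as a fifth term in the same way as `title_score`.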
Similar resources
OPTIMAL LOT-SIZING DECISIONS WITH INTEGRATED PURCHASING, MANUFACTURING AND ASSEMBLING FOR REMANUFACTURING SYSTEMS
This work applies fuzzy sets to the integration of purchasing, manufacturing and assembling of production planning decisions with multiple suppliers, multiple components and multiple machines in remanufacturing systems. The developed fuzzy multi-objective linear programming model (FMOLP) simultaneously minimizes total costs, total $\text{CO}_2$ emissions and total lead time with reference to cus...
Centroid-based summarization of multiple documents: sentence extraction, utility-based evaluation, and user studies
We present a multi-document summarizer, called MEAD, which generates summaries using cluster centroids produced by a topic detection and tracking system. We also describe two new techniques, based on sentence utility and subsumption, which we have applied to the evaluation of both single and multiple document summaries. Finally, we describe two user studies that test our models of multi-documen...
Multilingual Multidocument Summarization Tools and Evaluation
We describe a number of experiments carried out to address the problem of creating summaries from multiple sources in multiple languages. A centroid-based sentence extraction system has been developed which decides the content of the summary using texts in different languages and uses sentences from English sources alone to create the final output. We describe the evaluation of the system in th...
Extracting a parallel corpus from comparable documents to improve translation quality in machine translation systems
Data used for training statistical machine translation methods are usually prepared from three resources: parallel, non-parallel, and comparable text corpora. Parallel corpora are an ideal resource for translation, but because such texts are scarce, non-parallel and comparable corpora are also used for parallel text extraction. Most existing methods for exploiting comparable corpora loo...
Multi-document Summarization System: Using Fuzzy Logic and Genetic Algorithm
In recent times, the generation of multi-document summaries has gained a lot of attention among researchers. Most text summarization techniques use sentence extraction, where the salient sentences in the multiple documents are extracted and presented as a summary. In our proposed system, we have developed a sentence extraction based automatic multi-docu...
Publication date: 2001